$\newcommand{\xv}{\mathbf{x}} \newcommand{\Xv}{\mathbf{X}} \newcommand{\yv}{\mathbf{y}} \newcommand{\Yv}{\mathbf{Y}} \newcommand{\zv}{\mathbf{z}} \newcommand{\av}{\mathbf{a}} \newcommand{\Wv}{\mathbf{W}} \newcommand{\wv}{\mathbf{w}} \newcommand{\betav}{\mathbf{\beta}} \newcommand{\gv}{\mathbf{g}} \newcommand{\Hv}{\mathbf{H}} \newcommand{\dv}{\mathbf{d}} \newcommand{\Vv}{\mathbf{V}} \newcommand{\vv}{\mathbf{v}} \newcommand{\Uv}{\mathbf{U}} \newcommand{\uv}{\mathbf{u}} \newcommand{\tv}{\mathbf{t}} \newcommand{\Tv}{\mathbf{T}} \newcommand{\Sv}{\mathbf{S}} \newcommand{\Gv}{\mathbf{G}} \newcommand{\zv}{\mathbf{z}} \newcommand{\Zv}{\mathbf{Z}} \newcommand{\Norm}{\mathcal{N}} \newcommand{\muv}{\boldsymbol{\mu}} \newcommand{\sigmav}{\boldsymbol{\sigma}} \newcommand{\phiv}{\boldsymbol{\phi}} \newcommand{\Phiv}{\boldsymbol{\Phi}} \newcommand{\Sigmav}{\boldsymbol{\Sigma}} \newcommand{\Lambdav}{\boldsymbol{\Lambda}} \newcommand{\half}{\frac{1}{2}} \newcommand{\argmax}[1]{\underset{#1}{\operatorname{argmax}}} \newcommand{\argmin}[1]{\underset{#1}{\operatorname{argmin}}} \newcommand{\dimensionbar}[1]{\underset{#1}{\operatorname{|}}} \newcommand{\grad}{\mathbf{\nabla}} \newcommand{\ebx}[1]{e^{\betav_{#1}^T \xv_n}} \newcommand{\eby}[1]{e^{y_{n,#1}}} \newcommand{\Tiv}{\mathbf{Ti}} \newcommand{\Fv}{\mathbf{F}} \newcommand{\ones}[1]{\mathbf{1}_{#1}}$

27 Nonlinear Dimensionality Reduction with Digits Example

Principal Components Analysis (PCA)
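Quick review before the code: PCA projects the data onto the directions of greatest variance, which come directly from the singular value decomposition of the mean-centered data matrix. Writing $\Xv_c = \Xv - \bar{\xv}$ for the centered data,

$$\Xv_c = \Uv \Sv \Vv^T,$$

the columns of $\Vv$ are the principal directions, and the two-dimensional projection is $\Xv_c$ times the first two columns of $\Vv$. Each singular value $s_i$ in $\Sv$ measures how much variance the $i$-th direction captures.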


In [1]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import gzip
import pickle

In [2]:
!wget http://www.cs.colostate.edu/~anderson/cs480/notebooks/mnist.pkl.gz


--2017-04-18 08:51:54--  http://www.cs.colostate.edu/~anderson/cs480/notebooks/mnist.pkl.gz
Resolving www.cs.colostate.edu... 129.82.45.114
Connecting to www.cs.colostate.edu|129.82.45.114|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 16168813 (15M) [application/x-gzip]
Saving to: “mnist.pkl.gz.2”

100%[======================================>] 16,168,813  --.-K/s   in 0.1s    

2017-04-18 08:51:54 (106 MB/s) - “mnist.pkl.gz.2” saved [16168813/16168813]


In [3]:
# Load the pickled MNIST training, validation, and test sets.
with gzip.open('mnist.pkl.gz', 'rb') as f:
    train_set, valid_set, test_set = pickle.load(f, encoding='latin1')

X = train_set[0]                       # training images, one 784-pixel row per image
T = train_set[1].reshape((-1,1))       # training labels as a column vector

Xtest = test_set[0]
Ttest = test_set[1].reshape((-1,1))

X.shape, Xtest.shape


Out[3]:
((50000, 784), (10000, 784))
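So we have 50,000 training images and 10,000 test images, each a row of 784 pixel values that reshapes to a $28 \times 28$ grid. A quick sanity check, displaying the first few training digits with their labels (a minimal sketch; any indices would do):

In [ ]:
# Show the first five training images with their class labels.
for i in range(5):
    plt.subplot(1,5,i+1)
    plt.imshow(X[i,:].reshape((28,28)), cmap='gray')
    plt.title(str(T[i,0]))
    plt.axis('off')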

In [4]:
np.linalg.svd?

In [5]:
Xn = X - np.mean(X,axis=0)                        # center the data on the mean image
U,S,V = np.linalg.svd(Xn, full_matrices=False)    # economy-size SVD
V = V.T                                           # columns of V are the principal directions
V.shape


Out[5]:
(784, 784)
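With full_matrices=False we get the economy-size decomposition, so $\Uv$ is $50000 \times 784$, $\Sv$ holds 784 singular values, and $\Vv$, after the transpose above, is $784 \times 784$ with the principal directions as columns. A quick check that the factorization reproduces the centered data (a sketch; U * S scales each column of $\Uv$ by its singular value):

In [ ]:
# Verify the economy SVD reconstructs the centered data, up to floating point error.
print(np.allclose(Xn, np.dot(U * S, V.T)))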

In [6]:
S.shape


Out[6]:
(784,)

In [7]:
plt.plot(S)
plt.ylabel('Singular values')
plt.xlabel('Index')


Out[7]:
<matplotlib.text.Text at 0x7f95ea0544a8>
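The singular values fall off quickly. To make that quantitative, the fraction of total variance captured by the first $k$ components is $\sum_{i \le k} s_i^2 / \sum_i s_i^2$. A minimal sketch:

In [ ]:
# Cumulative fraction of variance explained by the leading components.
varExplained = np.cumsum(S**2) / np.sum(S**2)
plt.plot(varExplained)
plt.xlabel('Number of components')
plt.ylabel('Fraction of variance explained')
print('First two components explain', varExplained[1])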

In [8]:
svs = [0,1]                      # use the first two principal directions
Xproj = np.dot(Xn,V[:,svs])      # project the centered data into 2D
plt.figure(figsize=(10,10))
for i in range(1000):            # first 1000 training samples, drawn as their digit labels
    plt.annotate(T[i,0],Xproj[i,:2],horizontalalignment='center',
        verticalalignment='center')
plt.xlim(np.min(Xproj[:,0]), np.max(Xproj[:,0]))
plt.ylim(np.min(Xproj[:,1]), np.max(Xproj[:,1]));
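The digit classes overlap heavily in just two linear components. The same projection applies to new data, as long as we center it with the training mean, not the test mean. A sketch:

In [ ]:
# Project the test data onto the same two principal directions.
XtestProj = np.dot(Xtest - np.mean(X,axis=0), V[:,svs])
XtestProj.shape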


Nonlinear Dimensionality Reduction with Bottleneck Network
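The idea: train a network to reproduce its input through a narrow middle layer. The network below squeezes each 784-pixel image down through layers of 50, 20, and 10 tanh units to a 2-unit bottleneck, then expands back out to 784 linear outputs. Because the inputs serve as their own targets, training minimizes a squared reconstruction error of the form $\half \sum_n ||\xv_n - \hat{\xv}_n||^2$, where $\hat{\xv}_n$ is the network's output for input $\xv_n$ (the exact scaling used by neuralnetworksbylayer may differ). The 2-unit layer is thus forced to learn a two-dimensional, nonlinear encoding from which the images can be reconstructed.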


In [9]:
import neuralnetworksbylayer as nn

In [10]:
# Autoencoder: the inputs are also the targets, so the network must reproduce
# each image after squeezing it through the 2-unit bottleneck layer.
nnet = nn.NeuralNetwork([784,50,20,10,2,10,20,50,784])
nnet.train(X,X,nIterations=5000,verbose=True)


SCG: Iteration 500 ObjectiveF=0.62465 Scale=1.000e-15 Time=0.17470 s/iter
SCG: Iteration 1000 ObjectiveF=0.60363 Scale=1.000e-15 Time=0.19397 s/iter
SCG: Iteration 1500 ObjectiveF=0.59337 Scale=1.000e-15 Time=0.19373 s/iter
SCG: Iteration 2000 ObjectiveF=0.58750 Scale=1.000e-15 Time=0.19290 s/iter
SCG: Iteration 2500 ObjectiveF=0.58383 Scale=1.000e-15 Time=0.19208 s/iter
SCG: Iteration 3000 ObjectiveF=0.58079 Scale=1.000e-15 Time=0.19406 s/iter
SCG: Iteration 3500 ObjectiveF=0.57809 Scale=1.000e-15 Time=0.19260 s/iter
SCG: Iteration 4000 ObjectiveF=0.57579 Scale=1.000e-15 Time=0.19445 s/iter
SCG: Iteration 4500 ObjectiveF=0.57381 Scale=1.000e-15 Time=0.19284 s/iter
SCG: Iteration 5000 ObjectiveF=0.57192 Scale=1.000e-15 Time=0.19411 s/iter
Out[10]:
NeuralNetwork(TanhLayer(784,50),
              TanhLayer(50,20),
              TanhLayer(20,10),
              TanhLayer(10,2),
              TanhLayer(2,10),
              TanhLayer(10,20),
              TanhLayer(20,50),
              LinearLayer(50,784)
   Network was trained for 5000 iterations. Final error is 0.571919354272139.

In [12]:
plt.plot(nnet.getErrorTrace())


Out[12]:
[<matplotlib.lines.Line2D at 0x7f95cdcb7198>]

In [13]:
nnet.layers


Out[13]:
[TanhLayer(784,50),
 TanhLayer(50,20),
 TanhLayer(20,10),
 TanhLayer(10,2),
 TanhLayer(2,10),
 TanhLayer(10,20),
 TanhLayer(20,50),
 LinearLayer(50,784)]
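Counting from zero, the 2-unit bottleneck is nnet.layers[3], the TanhLayer(10,2). Rather than counting by eye, a quick sketch that prints each layer with its index:

In [ ]:
# Print each layer with its index; the bottleneck is TanhLayer(10,2) at index 3.
for i, layer in enumerate(nnet.layers):
    print(i, layer)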

In [16]:
bottle = nnet.layers[3].Y   # bottleneck outputs from the last forward pass, one row per training sample

In [17]:
bottle.shape


Out[17]:
(50000, 2)
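One row of bottleneck outputs per training sample. These come from a tanh layer, so every coordinate is confined to $(-1, 1)$; saturated units pile samples up near the corners of that square, which is why so much of the structure below sits near the edges. A quick check of the range (a minimal sketch):

In [ ]:
# Tanh outputs are bounded; see how close the bottleneck values come to saturation.
print(bottle.min(axis=0), bottle.max(axis=0))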

In [19]:
plt.figure(figsize=(10,10))
for i in range(1000):        # first 1000 training samples in the bottleneck space
    plt.annotate(T[i,0],bottle[i,:2],horizontalalignment='center',
        verticalalignment='center')
plt.xlim(np.min(bottle[:,0]), np.max(bottle[:,0]))
plt.ylim(np.min(bottle[:,1]), np.max(bottle[:,1]));



In [23]:
plt.figure(figsize=(10,10))
for i in range(1000):        # same plot, zoomed in on the lower-left corner
    plt.annotate(T[i,0],bottle[i,:2],horizontalalignment='center',
        verticalalignment='center')
plt.xlim(-1.,-0.8)
plt.ylim(-1., -0.4)


Out[23]:
(-1.0, -0.4)

Let's see where the test data is projected.


In [24]:
ytest = nnet.use(Xtest)   # forward pass on the test data, updating each layer's stored outputs

In [25]:
bottle = nnet.layers[3].Y    # the stored bottleneck outputs now correspond to the test data
plt.figure(figsize=(10,10))
for i in range(1000):        # first 1000 test samples in the bottleneck space
    plt.annotate(Ttest[i,0],bottle[i,:2],horizontalalignment='center',
        verticalalignment='center')
plt.xlim(np.min(bottle[:,0]), np.max(bottle[:,0]))
plt.ylim(np.min(bottle[:,1]), np.max(bottle[:,1]));


Now what are all of those samples doing in the lower-left corner? Let's draw the images for the samples farthest into the lower left, that is, the ones for which both bottleneck units produce the most negative outputs. Since those points lie near $(-1,-1)$, they are also the points farthest from the origin, so sorting by squared distance from the origin in decreasing order finds them.
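Alternatively, the lower-left samples can be selected directly with a boolean mask on both coordinates; a minimal sketch (the $-0.9$ cutoff is an arbitrary illustrative threshold, not from the original analysis):

In [ ]:
# Test samples whose two bottleneck outputs are both strongly negative.
mask = (bottle[:,0] < -0.9) & (bottle[:,1] < -0.9)
print(np.sum(mask), 'test samples in the lower-left corner')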


In [27]:
np.argsort?

In [28]:
sqDistFromOrigin = np.sum(bottle**2, axis=1)   # squared distance of each test sample from the origin
ordered = np.argsort(-sqDistFromOrigin)        # indices ordered from farthest to nearest

In [29]:
ordered.shape


Out[29]:
(10000,)

In [30]:
ordered[:5]


Out[30]:
array([8284,  124, 3308, 4742, 4225])

In [31]:
bottle[ordered[:5],:]


Out[31]:
array([[-0.97689315, -0.97889975],
       [-0.97653481, -0.97670006],
       [-0.96823662, -0.97963459],
       [-0.98368014, -0.96124719],
       [-0.98389871, -0.95786954]])

In [34]:
# Show the 100 test images farthest from the origin, which here means farthest
# into the lower-left corner of the bottleneck space.
for i in range(100):
    plt.subplot(10,10,i+1)
    plt.imshow(-Xtest[ordered[i],:].reshape((28,28)), interpolation='nearest', cmap='gray')
    plt.axis('off')
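To see which digit classes land in this corner, the same grid can be drawn with each image's label as its title (a sketch using the test labels loaded earlier):

In [ ]:
# Same 10x10 grid, titled with each sample's digit class.
plt.figure(figsize=(10,10))
for i in range(100):
    plt.subplot(10,10,i+1)
    plt.imshow(-Xtest[ordered[i],:].reshape((28,28)), interpolation='nearest', cmap='gray')
    plt.title(str(Ttest[ordered[i],0]))
    plt.axis('off')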